-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use skiplists to save on the bitmap size when compressing leaves #454
base: jsign-type-3
Are you sure you want to change the base?
Conversation
@@ -170,6 +170,7 @@ const ( | |||
leafType byte = 2 | |||
eoAccountType byte = 3 | |||
singleSlotType byte = 4 | |||
skipListType byte = 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is set in bit order, because I'm hoping to mix encodings in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eoAccountType
has value 3 which isn't respecting that idea. Is it worth changing now? If not, we're ina a half-baked idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could change it to 4, no long-term db has been using this.
// Code hash should be the empty code hash | ||
isEoA = v != nil && bytes.Equal(v, EmptyCodeHash[:]) | ||
case 4: | ||
// Code size must be 0 | ||
isEoA = v != nil && bytes.Equal(v, zero32[:]) | ||
default: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the EoA fix for Nyota.
proof_test.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that the changes in this PR are due to me building on top of two of your commits, which I need for my replay experiments.
k[StemSize] = ins.Suffix | ||
keys = append(keys, k[:]) | ||
prevalues = append(prevalues, nil) | ||
postvalues = append(postvalues, slices.Clone(ins.New[:])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This, and line 410, are the fixes for the test that was broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird, so this means that stemdiff
is somewhat mutated after affecting the returned Proof
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the area that is used to store the location of the iterator is reused over and over, and if we pass a reference to it, it will change as the iterator progress. This is not the first time we fall for this, and probably not the last time either, it's quite subtle and easy to forget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left some comments for your consideration.
k[StemSize] = ins.Suffix | ||
keys = append(keys, k[:]) | ||
prevalues = append(prevalues, nil) | ||
postvalues = append(postvalues, slices.Clone(ins.New[:])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird, so this means that stemdiff
is somewhat mutated after affecting the returned Proof
?
@@ -170,6 +170,7 @@ const ( | |||
leafType byte = 2 | |||
eoAccountType byte = 3 | |||
singleSlotType byte = 4 | |||
skipListType byte = 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eoAccountType
has value 3 which isn't respecting that idea. Is it worth changing now? If not, we're ina a half-baked idea.
for _, gap := range gaps { | ||
if gap.Count == 0 { | ||
break // skip the last gap as nothing follows | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do:
for j:=0; j<=gapCount; j++ {
and avoid if
L1852 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not completely, because if Count == 0
it means that the group ends with a gap, wherease if Count != 0
then the group ends with a range. But yeah, no point in going over the whole list, I'll loop on j
.
encoding.go
Outdated
rangecount := serialized[offset+1] | ||
gapsize := serialized[offset] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename gapsize
and valueIdx
to rangeSkip
and rangeCount
, as to use the same names in the serialization Skip
and Count
?
Using gap
or range
worlds can be confusing. Using the same names as the serialization makes easier to understand the both ways of the algorithm.
leafIdx += int(gap.Skip) | ||
result = append(result, byte(gap.Count)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking there can't be a 0-length range, so we should encode Count-1
instead of forcing the wraparound, which might not work on every platform.
leafIdx += int(gap.Skip) | |
result = append(result, byte(gap.Count)) | |
leafIdx += int(gap.Skip) | |
// count can be 256, bytes(256) == 0 which is used as a marker for a full leaf | |
result = append(result, byte(gap.Count)) |
Signed-off-by: Guillaume Ballet <[email protected]>
Signed-off-by: Guillaume Ballet <[email protected]>
@@ -56,6 +56,9 @@ const ( | |||
leafC1CommitmentOffset = leafCommitmentOffset + banderwagon.UncompressedSize | |||
leafC2CommitmentOffset = leafC1CommitmentOffset + banderwagon.UncompressedSize | |||
leafChildrenOffset = leafC2CommitmentOffset + banderwagon.UncompressedSize | |||
leafSkipListCOffset = nodeTypeSize + StemSize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: leafSkipListC0Offset
values[valueIdx] = serialized[offset : offset+leafSlotSize] | ||
|
||
// shortcut: the leaf is full and so both values are 0. | ||
if serialized[offset] == 0 && serialized[offset+1] == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this worth it? If the node is full, doesn't sound we should serialize with skip list really. Sounds wasteful.
Signed-off-by: Guillaume Ballet <[email protected]>
Instead of using a bitmap, use two bytes:
There are ~200M accounts in the state, for which we use at most 6 bytes instead of 32 for a bitmap. This is expected to save (32 - 6) * 200 ~ 5GB
There are further optimizations that will not be handled by this PR, but that I'm writing down here so that I don't forget:
This PR also contains some fixes for the EoA encoding, that got broken after the Nyota costs.